Abstract

Background: The GTPase eEF1A is the eukaryotic factor responsible for the essential, universal function of
aminoacyl-tRNA delivery to the ribosome. Surprisingly, eEF1A is not universally present in eukaryotes, being
replaced by the paralog EFL independently in multiple lineages. The driving force behind this unusually frequent
replacement is poorly understood.

Results: Through sequence searching of genomic and EST databases, we find a striking association of eEF1A
replacement by EFL and loss of eEF1A's guanine exchange factor, eEF1Bα, suggesting that EFL is able to
spontaneously recharge with GTP. Sequence conservation and homology modeling analyses indicate several
sequence regions that may be responsible for EFL's lack of requirement for eEF1Bα.

Conclusions: We propose that the unusual pattern of eEF1A, eEF1Bα and EFL presence and absence can be
explained by a ratchet-like process: if either eEF1A or eEF1Bα diverges beyond functionality in the presence of
EFL, the system is unable to return to the ancestral, eEF1A:eEF1Bα-driven state.
Background

EF1A in eukaryotes (eEF1A) and archaea (aEF1A) is a highly expressed, essential GTPase translation
factor. Just like its bacterial ortholog, EF-Tu, EF1A delivers aminoacyl-tRNA (aa-tRNA) to the
ribosome in complex with GTP during the elongation stage of translation. Accommodation of the
aa-tRNA in the ribosomal A site induces GTP hydrolysis by EF1A, releasing the GDP-bound factor
from the ribosome [1]. GDP bound to eEF1A needs to be replaced with GTP for the next functional
cycle to begin. Some translational GTPases, such as SelB, a close relative of EF1A, dissociate GDP
rapidly, which leads to spontaneous recharging [2]. However, dissociation of GDP from EF-Tu and
EF1A is extremely slow [3,4] and therefore these GTPases require a dedicated guanine exchange
factor (GEF) for recharging: EF-Ts in bacteria and EF1B in eukaryotes (eEF1B) and archaea
(aEF1B) [5]. Unlike EF-Ts, eEF1B is a multi-subunit protein, with GEF activity residing in the
alpha subunit (eEF1Bα) [3,6]. The crystal structure of the eEF1A:eEF1Bα carboxy terminus complex
has shed light on the mechanisms of exchange at the molecular level, showing which parts of
eEF1A and eEF1Bα interact and how this brings about GDP dissociation [7,8].

Besides its role in translation, eEF1A has a variety of additional, "moonlighting" functions.
These include actin bundling, nuclear export of aa-tRNAs, proteolysis of misfolded proteins,
modulating apoptosis, response to amino acid starvation and viral replication [9,10]. Because
eEF1A is universally essential in translation and an accessory protein in a variety of other
processes, the discovery that some eukaryotes lack it was unexpected [11]. In these
eEF1A-lacking organisms, another related factor, EFL (for EF1A-like), is present. EFL carries
the same domain structure as eEF1A and is presumably functionally equivalent. Most surprising
is the broad but discontinuous distribution of EFL in eukaryotes and its usual mutual
exclusivity with eEF1A [11,12]. The pattern of presence and absence has been explained both by
horizontal gene transfer (HGT) and by long-term co-maintenance followed by lineage sorting
(some lineages losing eEF1A and some losing EFL) [12-17]. Given the absence of strong support
for any specific instance of HGT of EFL, the possibility that the last common ancestor of
eukaryotes carried both eEF1A and EFL cannot be ruled out [12]. However, this would have
required millions of years of functional redundancy in the co-maintenance of eEF1A and EFL
without either being lost before the divergence of all eukaryotic groups that carry EFL.
Regardless of the mode of descent of EFL, the ancestral state at the last common ancestor of
eukaryotes and archaea is likely to be co-presence of eEF1A and eEF1Bα and absence of EFL, as
aEF1A and aEF1B are found in all archaea, but EFL has never been identified in this domain of
life.

Due to a near complete absence of experimental investigations of EFL, the evolutionary
mechanisms driving mutual exclusivity of eEF1A and EFL are poorly understood. We hypothesize
that the key to this phenomenon lies in the differences in the functional cycle of the two
proteins. It was briefly noted [15] that the GEF eEF1Bα has not been identified in the genomes
of the EFL-containing organisms Thalassiosira, Chlamydomonas and Ostreococcus, suggesting that
EFL may self-recharge like some other translational GTPases, or that a non-homologous GEF may
be involved. With the increasing number of genomes and large scale EST data available for many
eukaryotes, including those that carry EFL, we have conducted a large-scale survey of EFL,
eEF1A and eEF1Bα presence and absence across the eukaryotic tree of life. We show a striking
association of EFL presence with loss of eEF1Bα and eEF1A. We hypothesize a ratchet-like
evolutionary process of reduction: if eEF1A or eEF1Bα diverges beyond functionality in the
presence of EFL, the system is unable to return to the ancestral, eEF1A:eEF1Bα-driven state.
Whether EFL loss is similarly irremediable depends on the rate of HGT of this factor.

Motivated by our hypothesis, we set out to test whether EFL can substitute for a loss of
eEF1Bα in an organism possessing an eEF1A gene. Similarly to a previous study that found that
Diplonema EFL cannot substitute for eEF1A in Trypanosoma [18], we find that Monosiga
brevicollis EFL is unable to substitute for either double eEF1A and eEF1Bα deletion or single
eEF1Bα deletion in Saccharomyces cerevisiae, thus suggesting the existence of functional
barriers to the spread of EFL across eukaryotes.

Results 

Sequence searching of genomic and EST databases shows a striking pattern of presence and
absence of EFL, eEF1A and eEF1Bα (Figure 1 and Additional file 1: Table S1). Both EFL and
eEF1A are present in all major lineages of eukaryotes, and their presence is mostly, but not
universally, mutually exclusive, as previously reported [11]. Consistent with the experimental
demonstration by Szabova et al. [18] that eEF1A and EFL can be co-maintained without affecting
growth, the two factors co-occur in a modest number of organisms (Thalassiosira pseudonana,
Guillardia theta, Karenia brevis, Symbiodinium sp. and Ulva prolifera), although these
organisms carry no detectable eEF1Bα. The gut fungus Basidiobolus ranarum has also been found
to carry both eEF1A and EFL [19]; however, in the absence of a genome or EST project to search
for eEF1Bα, it was not included in our survey. This co-maintenance may reflect incomplete
lineage sorting, or may be because eEF1A is still required for one of its moonlighting
functions in these organisms. Importantly, we find the loss of eEF1A is almost universally
accompanied by parallel loss of eEF1Bα, suggesting that EFL functioning does not require the
assistance of this GEF (Figure 1 and Additional file 1: Table S1). In some rare cases where
eEF1A or eEF1Bα are detectable along with EFL, degradation of the eEF1A or eEF1Bα sequence is
apparent, even in functionally important sites; eEF1A is evolving with an apparent loss of
selective constraint on the protein sequence in Allomyces macrogynus, Aspergillus niger,
Pseudo-nitzschia multiseries, Fragilariopsis cylindrus, Bigelowiella natans, Chlamydomonas
reinhardtii and Volvox carteri, while eEF1Bα is in the process of decay in Guillardia theta
and Pythium ultimum. Thecamonas trahens, on the other hand, encodes a divergent EFL along with
eEF1A and an apparent eEF1B-kinase protein fusion (Figure 1 and Additional file 1: Table S1).
Thus, in our snapshot of eukaryotic history, we have captured multiple ongoing cases of gene
degradation towards loss. Overall, the association of EFL presence with eEF1A and eEF1Bα loss
is statistically significant whether the divergent sequences are considered as present or
absent (p < 0.00001 in both cases).
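
The test behind these p-values is not named in the text. As a minimal sketch, assuming a
two-sided Fisher's exact test on a 2x2 presence/absence table, such an association could be
checked as follows. The function name is hypothetical and the counts in the usage lines are
placeholders, not the survey's data; real counts would be tallied from Additional file 1:
Table S1.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): rows = EFL present / absent,
    columns = eEF1Balpha absent / present.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Placeholder counts for illustration only (NOT taken from the paper).
p_value = fisher_exact_two_sided(20, 2, 1, 30)
print(f"p = {p_value:.3g}")
```

Running the test twice, once scoring divergent sequences as present and once as absent,
would mirror the two cases reported above.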

Phylogenetic analysis of the EFL sequences detected in this study gives a tree that is overall
similar to other published EFL phylogenies (for example [13,14,16,20]), although with fewer
taxa as we only considered organisms with large scale EST and whole genome data available
(Additional file 2: Figure S1A). Even with additional sequences from PCR amplification and
sequencing of individual genes, phylogenetic analysis of EFL does not shed much light on the
origin and deep evolutionary history of EFL; the deepest branches in the phylogenetic tree
lack strong statistical support in ML and/or Bayesian analyses, and the EFL tree cannot be
rooted reliably due to long branch attraction of divergent sequences to the outgroup [20].
Therefore, the path to EFL replacement of eEF1A (HGT or co-maintenance with differential loss)
is hard to determine, as also found with probabilistic models of EFL gain and loss [21].
However, we do see strong bootstrap and posterior probability support for some taxonomic
assemblages within the EFL tree, for example Dinophyceae (89% MLBP (maximum likelihood
bootstrap percentage) and 1.0 BIPP (Bayesian inference posterior probability)), Rhodophyta
(99% MLBP, 1.0 BIPP),



Atkinson et al. BMC Evolutionary Biology 2014, 14:35
http://www.biomedcentral.com/1471-2148/14/35



[Figure 1 image. Factor key: EFL; EFL (divergent); eEF1A; eEF1A (divergent); eEF1B; eEF1B
(divergent). Lineage color key: Holozoa, Apusozoa, Amoebozoa, Cryptophyta, Excavata,
Haptophyceae, Stramenopiles, Alveolata.]


Figure 1 Cladogram showing presence and absence of elongation factors across eukaryotes. The tree summarizes current knowledge of 
the taxonomic grouping of the species considered here. Polytomies are present where branching order is unknown or contentious. Colored 
shading behind branches indicates major lineages, as per the color key in the top left. Circles show presence and absence of intact (opaque) or 
degraded (semi-transparent) elongation factors, with colors indicating factor identity according to the top left key. 



fungi, excluding Conidiobolus coronatus (99% MLBP, 1.0 BIPP), stramenopiles (Heterokontophyta +
Oomycetes, 100% MLBP, 1.0 BIPP) and members of Choanoflagellatea + Ichthyosporea (100% MLBP,
1.0 BIPP). Dinophyceae and Rhodophyta are all EFL encoding, suggesting that EFL was vertically
inherited within these groups. Surprisingly, EST evidence suggests eEF1A has not been
completely lost in Dinophyceae, with eEF1A ESTs being detected in Symbiodinium sp. and Karenia
brevis (Figure 1 and Additional file 1: Table S1). In the case of Choanoflagellatea,
Ichthyosporea, stramenopiles and fungi, some species encode EFL while some encode eEF1A. It is
unclear whether co-maintenance and differential loss alone are responsible for this
distribution, or whether HGT has been involved. Whatever the source of EFL in these taxa,
lineage sorting appears incomplete, with eEF1A sequence relics being detected in some
EFL-encoding stramenopiles and fungi. The backbone of the eEF1A tree is poorly resolved and
multiple paralogs of eEF1A are apparent (Additional file 2: Figure S1B). Some of these are
highly divergent, such as those from Tetrahymena thermophila and Paramecium tetraurelia, which
are attracted to the degrading eEF1As in organisms encoding EFL (Additional file 2: Figure
S1B).

Comparison of patterns of evolution across sites using a consensus sequence alignment of eEF1A
versus EFL (Figure 2) shows differentially conserved sites across all domains (G domain,
domain II and domain III), with some such sites clustering together. To determine how the
sequence changes affect the structural contacts of EFL, a homology model was made of
Chlamydomonas incerta EFL using the X-ray crystal structure of S. cerevisiae eEF1A in complex
with eEF1Bα as the template (PDB ID 1IJE) [8].

Many of the eEF1A residues that interact with eEF1Bα overlap with regions important for
nucleotide or aa-tRNA binding (Figure 2). Therefore, it is unsurprising that these
multifunctional regions are well conserved in EFL. However, regions of eEF1A that are
apparently specialized for eEF1Bα binding are often very different in EFL. The GTPase (G)
domain is overall well conserved, particularly in the nucleotide binding loops, with less
conservation
Figure 2 Consensus and example sequence alignment of eEF1A and EFL. Consensus sequences aligned with example sequences for eEF1A
and EFL were calculated at the 70% level using the Python program Consensus Finder [34]. Ruler coloring indicates the boundaries of the three
domains. Shading behind residues shows conservation patterns: turquoise - strongly differentially conserved sites; blue - sites conserved in EFL
but not eEF1A; green - sites conserved in eEF1A and not in EFL. Red boxes indicate the location of the nucleotide binding motifs of the G
domain. Colored lines beneath the alignment indicate structural features as follows: orange - eEF1Bα interacting sites; blue - residues lining the
amino-acyl moiety binding pocket; green - the extended loop of the helix-loop-helix on the ribosome binding surface.
seen in the exposed loops in between. One of the most striking differences between eEF1A and
EFL in this domain is a six amino acid-long strongly differentially conserved patch at Figure 2
coordinates 74-79 (consensus sequences aCTTKA in EFL and DIALWK in eEF1A) that is located in a
helix between the G2 and G3 nucleotide binding motifs. The DIALWK motif is part of a loop of
eEF1A that directly interacts with eEF1Bα through hydrogen bonds (Figure 3A and B) [7]. The
conformation of this loop is also stabilized by another strongly differentially conserved
site: alignment position 35 (D35 in eEF1A and P35 in EFL, Figure 2). This suggests that the
striking sequence differences in these residues are directly related to EFL's apparent lack of
requirement for eEF1Bα. Another differentially conserved residue of this domain that in eEF1A
may interact with eEF1Bα is in position 106 in Figure 2 (T106 in eEF1A and A106 in EFL,
Figure 3A).

Three of the five sequence insertions in EFL relative to eEF1A (Ins1-5, Figure 2) are found in
the G domain. Structural alignment of EFL with the structure of eEF1A's bacterial ortholog
EF-Tu on the ribosome [22] suggests Ins1 and Ins2 are exposed with no obvious ribosomal
interaction partners, but Ins3 extends a helix-loop-helix structural element on the
ribosome-binding face (Figure 3C). This insertion is also interesting as it overlaps with a 12
amino acid insertion in opisthokont eEF1A [23]. However, the sequence alignment of these two
insertions relative to each other is ambiguous, and thus there is no evidence that the Ins3
insertion is homologous to the animal/fungal insertion. A single conserved amino acid deletion
in EFL relative to eEF1A is also apparent, but is found in an exposed loop of eEF1A that is
poorly conserved (position 120 in Figure 2).

In domain II, very strong conservation between eEF1A and EFL is seen in the regions that in
bacterial EF-Tu form the pocket for accommodating the aminoacyl moiety of aa-tRNA [22]
(Figure 2). Differentially conserved sites in this domain are largely in exposed loops;
however, there are three residues that in eEF1A are positioned to potentially interact with
eEF1Bα and are strongly conserved as chemically different amino acids in EFL: alignment
coordinates 266-267 and 272, corresponding to Q249, D250 and G255 in eEF1A and to S264, G265
and K270 in EFL, respectively (Figures 2 and 3A).

In domain III, differentially conserved sites are mostly dispersed and largely exposed.
However, two differentially conserved regions are positioned to be involved in the
eEF1A:eEF1Bα interaction (Figures 2 and 3A): firstly DCHTAHI in eEF1A, which is FVR.GRs in EFL
(starting at position 383 in Figure 2, 360 in S. cerevisiae eEF1A and 381 in C. incerta EFL),
and secondly amino acids GN in EFL, conserved as MR in eEF1A (starting at position 450 in
Figure 2, 427 in S. cerevisiae eEF1A and 448 in C. incerta EFL). eEF1A is extended in sequence
at the extreme C terminus by an average of 17 amino acids relative to EFL (Figure 2). The
amino acid contacts in this region are unknown as they are not present in the crystal
structure.

Mutations of eEF1A in S. cerevisiae have previously been shown to confer independence from
eEF1Bα: R164K, T22S,
Figure 3 Homology model of EFL in complex with eEF1Bα. The homology model of EFL based on the structure of eEF1A in complex with
eEF1Bα (red ribbon) is shown A) in cartoon form, B) as a surface, showing the exposed face when in complex with the ribosome, and C) as a
surface showing the ribosome binding face. In all panels, turquoise coloring shows strongly differentially conserved sites as per the alignment in
Figure 2, and blue parts of the structure show conserved insertions in EFL relative to eEF1A. Residues shown as turquoise sticks are those
differentially conserved sites that in eEF1A interact with eEF1Bα.



A112T, and A117V [24] (Figure 2). These are mostly located in the G domain and, surprisingly,
only two differ in conservation between eEF1A and EFL: A112 is unconserved in EFL, and A117 is
differentially conserved as P117 in EFL. This suggests there are multiple routes to
independence from eEF1Bα, although the single amino acid replacement routes may be highly
species-specific in their effectiveness. There have been no cases of natural eEF1Bα-free
eEF1As reported, and our distribution analysis suggests this would be rare.

Biochemical experimentation demonstrating rapid self-recharging of EFL:GDP with GTP would be
the unequivocal proof of our hypothesis that EFL can functionally substitute for eEF1A and
eEF1Bα loss. However, all our attempts to overexpress EFL in E. coli have failed (data not
shown). An alternative strategy is to perform in vivo complementation experiments by replacing
either both eEF1A and eEF1Bα or just eEF1Bα with EFL. Using S. cerevisiae as a model organism,
we generated strains with controlled expression of eEF1A, eEF1Bα and M. brevicollis EFL (see
Additional file 3: Materials and methods). Removal of saver plasmids expressing either both
eEF1Bα and eEF1A or just eEF1Bα by addition of 5-fluoroorotic acid (5-FOA) resulted in loss of
viability that was not rescued by expression of M. brevicollis EFL (Additional file 4: Figure
S2), indicating that in S. cerevisiae, EFL does not seem to be able to complement the loss of
either eEF1Bα alone or eEF1Bα and both genes encoding eEF1A (TEF1 and TEF2 [25]).

Discussion 

The pattern of presence and absence of EFL, eEF1A and eEF1Bα allows us to derive a
ratchet-like model for the evolutionary dynamics of elongation factor gain and loss in terms
of the viable and inviable fates of different combinations of elongation factors (Figure 4).
It is likely that EFL arose by gene duplication after the last common ancestor of eukaryotes
and archaea, which encoded eEF1A and eEF1Bα. Degradation of eEF1A or eEF1Bα through random
genetic drift in the presence of EFL is likely to be an almost irreversible step that acts as
the pawl of the ratchet: a return to the ancestral state would require that the lost gene is
quickly re-transferred before its binding partner diverges beyond the point of functional
interaction, an unlikely scenario.
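
The ratchet described above can be sketched as a small state machine. This is an illustrative
simplification, not the model of Figure 4 itself: it assumes a genotype is viable if it
carries EFL or the eEF1A + eEF1Bα pair, that genes are lost one at a time, and that re-gain of
eEF1A or eEF1Bα never happens (the text calls it unlikely rather than impossible). All
function and variable names are hypothetical, and `eEF1Ba` stands in for eEF1Bα.

```python
GENES = ("EFL", "eEF1A", "eEF1Ba")  # eEF1Ba = eEF1Balpha (ASCII identifier)


def viable(state):
    """Assumed rule: EFL works alone; eEF1A needs its GEF eEF1Balpha."""
    return "EFL" in state or {"eEF1A", "eEF1Ba"} <= state


def loss_steps(state):
    """Genotypes reached by losing one gene, keeping only viable outcomes."""
    return [state - {g} for g in state if viable(state - {g})]


def reachable(start, allow_efl_hgt=False):
    """All genotypes reachable by gene loss, optionally with EFL gain by HGT."""
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        nxt = loss_steps(current)
        if allow_efl_hgt and "EFL" not in current:
            nxt.append(current | {"EFL"})
        for state in nxt:
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return seen


ANCESTRAL = frozenset({"eEF1A", "eEF1Ba"})
ALL_THREE = frozenset(GENES)

# Once eEF1Balpha is gone, the ancestral state is out of reach in this model.
after_gef_loss = frozenset({"EFL", "eEF1A"})
print(ANCESTRAL in reachable(after_gef_loss, allow_efl_hgt=True))
```

Under these assumptions the only way back into the three-gene state is EFL gain into the
ancestral background, which is the HGT-dependent step discussed next.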

The ratchet may also work in the reverse direction, i.e. once EFL is lost from an EFL + eEF1A
+ eEF1Bα-encoding organism there is no going back. This depends on how frequently, if at all,
EFL is transmitted by HGT, which is currently unclear [12-17]. In the absence of EFL HGT, the
ratchet is nonprocessive and acts merely as a lineage-sorting evolutionary mechanism,
degrading EFL + eEF1A + eEF1Bα-encoding organisms into either EFL- or eEF1A + eEF1Bα-encoding.
Repetitive re-introduction of EFL by HGT into an eEF1A + eEF1Bα background would make the
ratchet processive, potentially leading to an enrichment of EFL-containing organisms. Given
the uncertainty in EFL HGT rates, it is impossible to assess the processivity of the ratchet.
In the most extreme case of nonprocessivity, EFL would not have been subject to HGT at all,
would have been present in the eukaryotic ancestor, and then all three genes would have been
maintained for millions of years before the divergence of all modern EFL-encoding groups of
organisms.

The relative rates of transitioning between states of factor composition are likely to be
highly species specific; while EFL is clearly capable of replacing eEF1A in multiple lineages,
it has been experimentally shown that Diplonema EFL can be co-maintained with, but cannot
replace, eEF1A in Trypanosoma [18]. Our results suggest the same is true for replacement of
either both eEF1A and eEF1Bα or just eEF1Bα with M. brevicollis EFL in S. cerevisiae
(Additional file 4: Figure S2). EFL is not naturally found in any yeasts, which may reflect an
irreplaceability of yeast eEF1A, or may be because successful HGT is rare in this group of
organisms [26].

The stability of the intermediate states (EFL in combination with eEF1A and/or eEF1Bα) depends
on organism-specific constraints such as multifunctionalisation (which may drive the system
towards co-maintenance of both paralogs), and evolutionary selection for genome reduction
(which could increase the rate of loss). The rare cases of dual maintenance may be driven by
multifunctionality of eEF1A [12]. In fact, it is surprising that eEF1A is not maintained in
parallel to EFL more often given its plethora of "moonlighting" functions. One explanation
could be that eEF1A is not universally multifunctional, or that its additional functions do
not provide enough selective advantage for its maintenance. It is also possible that some of
the moonlighting functions could be carried out by EFL, or by one of the other two closely
related paralogs of eEF1A, eRF3 or Hbs1p. Indeed, eRF3 and Hbs1p have already taken over
eEF1A's additional ancestral functions in translation termination and mRNA decay via eRF1 and
Dom34p binding [27,28].
Figure 4 Evolutionary dynamics of elongation factors. The model shows the possible and impossible combinations of elongation factors EFL
(turquoise), eEF1A (purple) and eEF1Bα (red) following gene acquisition and loss. Impossible scenarios (those that would be fatal for the organism) are
indicated with a skull and crossbones. Other notation is explained in the inset box.



Our sequence conservation and homology modeling analyses indicate several sequence regions
that may be responsible for EFL's lack of requirement for eEF1Bα. However, we cannot rule out
the possibility that EFL has hijacked another GEF for recharging. One candidate could be
eIF2α, the GEF for an ancient paralog of eEF1A and EFL, eIF2γ. Although homology is not
apparent at the sequence level, eIF2α is structurally similar to eEF1Bα [29].

The ratchet mechanism of elongation factor replacement relies only on random genetic drift and
can explain how eEF1A can be efficiently replaced by EFL without the latter needing to be a
"better" elongation factor, i.e. providing a selective advantage in itself. The ability of EFL
to recharge without a specialized accessory factor does not in itself make it an improved
enzyme; the impact of this reduction on the functional cycle of the elongation factor is
unknown.



Conclusions 

The genomic distribution of the guanine exchange factor eEF1Bα considered alongside that of
EFL and eEF1A gives a very strong indication that EFL is able to recharge without this
exchange factor. Thus, the presence of EFL has apparently allowed the decay and loss of both
eEF1A and eEF1Bα in some lineages of eukaryotes, a ratchet-like process where return to the
ancestral state is unlikely. Horizontal transmission of EFL has been proposed among
eukaryotes; however, current sequence data is inadequate for determining the rate of transfer,
or indeed whether it occurs at all. Additional sequencing efforts are required to more fully
resolve the dynamics of EFL through the evolutionary history of eukaryotes. Further in vitro
and in vivo experimentation is also required to answer the question of whether EFL
self-recharges or whether exchange is promoted by another factor.
Methods

BLAST searches were carried out at the JGI (http://genome.jgi.doe.gov/), NCBI
(http://ncbi.nlm.nih.gov/), Origin of Multicellularity [30], GeneDB [31] and Cyanidioschyzon
merolae genome (http://merolae.biols.u-tokyo.ac.jp/) database webpages, using M. brevicollis
EFL, Monosiga sp. (ATCC 50635) eEF1A and S. cerevisiae eEF1Bα as queries. The BLASTp method
was used where protein models were available; otherwise tBLASTn was used to search protein
against translated genomic and EST nucleotide sequences. Nucleotide hits were translated into
protein using Transeq at the EBI (http://www.ebi.ac.uk/Tools/). The E value limit was set to
1e-5 and sequences found with eEF1Bα were checked with Pfam to confirm identity based on the
presence of the EF1_GNE (EF1 guanine nucleotide exchange) domain [32]. eEF1Bα has been subject
to gene duplication in some lineages, resulting in paralogs such as eEF1Bδ in Metazoa and
eEF1Bβ in plants. As we are interested in presence or absence of a detectable eEF1Bα homolog
and not the complete family tree of this protein family, which has been addressed elsewhere
[5], only the top hit was retained. In house translational GTPase datasets (GCA) were used for
classification of EFL and eEF1A sequences.
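
The searches above were run through database web pages. For readers repeating the filtering
step on locally generated results, the following is a minimal sketch that assumes BLAST
tabular output (`-outfmt 6`, where column 11 is the E value and column 12 the bit score). It
applies the 1e-5 limit and keeps the best-scoring hit per query. The function name and the
example identifiers are hypothetical, and ranking by bit score is an assumption about what
"top hit" means.

```python
def top_hits(tabular_lines, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} for the best hit per query.

    Expects BLAST -outfmt 6 lines (12 tab-separated columns).
    Hits with an E value above max_evalue are discarded.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical example rows (identifiers and scores are made up).
rows = [
    "Sc_eEF1Ba\tcontig_12\t55.0\t120\t50\t2\t1\t120\t300\t660\t1e-30\t110",
    "Sc_eEF1Ba\tcontig_40\t30.0\t80\t52\t4\t5\t84\t10\t250\t0.02\t35",
]
print(top_hits(rows))
```

A query with no surviving hit is simply missing from the result, which corresponds to scoring
the factor as undetectable in that organism.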

Sequences were aligned using MAFFT v6.864b with the L-INS-i strategy [33]. Consensus sequences
were generated with the Consensus Finder Python script [34]. Only full length, non-degrading
sequences or identical/nearly identical duplicates from the same organism were included in the
data set. The threshold conservation level was set to 70%.
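
To make the 70% threshold concrete, the following is a minimal sketch of a column-wise
consensus and of one way to label differentially conserved sites between two families. It is
not the Consensus Finder script [34]; the function names, the use of `.` for unconserved
columns, the label names and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def consensus(seqs, threshold=0.7):
    """Column-wise consensus of equal-length aligned sequences.

    A column yields its most common residue if that residue reaches the
    threshold fraction; otherwise (or if gaps dominate) it yields '.'.
    """
    out = []
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        conserved = residue != "-" and count / len(column) >= threshold
        out.append(residue if conserved else ".")
    return "".join(out)


def classify_sites(cons_a, cons_b):
    """Label each alignment column by how conservation differs between families."""
    labels = []
    for a, b in zip(cons_a, cons_b):
        if a != "." and b != "." and a != b:
            labels.append("differential")   # conserved in both, different residue
        elif a != "." and b == ".":
            labels.append("a_only")         # conserved only in family a
        elif a == "." and b != ".":
            labels.append("b_only")         # conserved only in family b
        else:
            labels.append("same_or_none")
    return labels


# Toy alignments (made up) built around the DIALWK / CTTKA motifs in Figure 2.
eef1a_like = ["DIALWK", "DIALWK", "DIALWR"]
efl_like = ["ACTTKA", "ACTTKA", "SCTTKA"]
print(classify_sites(consensus(eef1a_like), consensus(efl_like)))
```

With only three sequences per family, a single substitution drops a column below 70%, which
is why real consensus calls need the larger alignments described here.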

For phylogenetic analyses of eEF1A and EFL, gap-rich ambiguous alignment regions were
identified by eye and removed. Extremely truncated sequences typical of ESTs were removed to
minimize the amount of missing data. This resulted in dataset dimensions of 462 aligned amino
acid positions from 72 sequences for EFL, and 446 positions from 82 sequences for eEF1A.
Phylogenetic analyses were carried out with RAxML [35] and MrBayes [36] on the CIPRES Science
Gateway v3.2 [37]. MrBayes was run with a mixed model plus the gamma rate distribution, with
the program converging on the WAG model (1.0 posterior probability) in the case of EFL, and
RTREV (0.99 posterior probability) in the case of eEF1A. Two independent runs of 4 chains were
run for 2 million generations, sampling every 1000 generations. A consensus tree was generated
after a burn-in of 200,000 generations. At the end of the runs, the standard deviations of
split frequencies (SDSF) were 0.01 in the case of EFL and 0.1 in the case of eEF1A. RAxML was
run taking into account the MrBayes model selection, with the WAG + CAT model for EFL and the
RTREV + CAT model for eEF1A, with 100 bootstrap replicates in each case.
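
As a quick arithmetic check on the sampling scheme above, the following sketch works out how
many sampling intervals each run spans and what share falls inside the burn-in. The function
name is hypothetical, and the count ignores any sample a program may record at generation 0.

```python
def mcmc_samples(generations, sample_every, burn_in):
    """Return (sampling intervals per run, intervals in burn-in, burn-in fraction)."""
    total = generations // sample_every
    discarded = burn_in // sample_every
    return total, discarded, discarded / total


# Values taken from the run settings stated in the text.
total, discarded, fraction = mcmc_samples(2_000_000, 1000, 200_000)
print(total, discarded, fraction)
```

With these settings the burn-in removes the first tenth of each run, leaving 1800 sampling
intervals per run to build the consensus tree.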

A structural homology model of EFL was generated using Swiss-Model [38] with the crystal
structure of the eEF1A:eEF1Bα complex (PDB ID 1IJE, [8]) as the structural template. Using
MacPyMOL (www.pymol.org), the EFL model was structurally aligned with the crystal structure of
EF-Tu on the ribosome (PDB IDs 2WRN and 2WRO) [22] in order to indicate likely ribosome and
aa-tRNA binding surfaces.

Additional files 



Additional file 1: Table S1. Presence and absence of EFL, eEF1A and
eEF1Bα with sequence ID numbers. Sequences are ordered by taxonomy.
Paler colours indicate that a sequence is present, but highly divergent.

Additional file 2: Figure S1. Phylogenies of eEF1A and EFL. See
Additional file 3 for legend.

Additional file 3: Supplementary Methods, full legends for figures
S1 and S2, and Table S2 of strains and plasmids used in the study.

Additional file 4: Figure S2. Both S. cerevisiae double eEF1Bα and
eEF1A knock-out and eEF1Bα single knock-out are not complemented
in vivo by Monosiga brevicollis EFL. See Additional file 3 for legend.